Fully Delexicalized Contexts for Syntax-Based Word Embeddings
Authors
Abstract
● We propose fully delexicalized contexts derived from syntactic trees to train word embeddings
● We demonstrate and evaluate our embeddings compared to vanilla word2vec on:
  ○ Nearest neighbours
  ○ Correlation with human judgement
  ○ Dependency parsing
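The idea of a fully delexicalized context can be illustrated with a short sketch. Assuming CoNLL-U-style token fields, each word's contexts are built only from dependency labels and POS tags, never from lexical items; the exact context template below (arc label plus the POS of the head or dependent) is an illustrative assumption, not necessarily the paper's precise definition.

```python
from collections import namedtuple

# CoNLL-U-style token: index, surface form, POS tag, head index, relation
Token = namedtuple("Token", "idx form upos head deprel")

def delexicalized_contexts(sentence):
    """For each word, emit (word, context) pairs whose contexts
    contain only POS tags and dependency labels (no word forms)."""
    by_head = {}
    for tok in sentence:
        by_head.setdefault(tok.head, []).append(tok)
    pairs = []
    for tok in sentence:
        contexts = []
        # context from the governing arc: relation + head's POS
        if tok.head != 0:
            head = sentence[tok.head - 1]
            contexts.append(f"{tok.deprel}^{head.upos}")
        # contexts from dependents: inverse relation + child's POS
        for child in by_head.get(tok.idx, []):
            contexts.append(f"{child.deprel}_inv^{child.upos}")
        pairs.extend((tok.form, c) for c in contexts)
    return pairs

# "The cat sleeps" with a toy parse
sent = [
    Token(1, "The", "DET", 2, "det"),
    Token(2, "cat", "NOUN", 3, "nsubj"),
    Token(3, "sleeps", "VERB", 0, "root"),
]
print(delexicalized_contexts(sent))
# e.g. ('cat', 'nsubj^VERB'): the word "cat" is trained against a
# purely syntactic context, which transfers across vocabularies
```

The resulting (word, context) pairs can be fed to any skip-gram implementation that accepts arbitrary contexts in place of linear windows.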
Similar papers
Delexicalized Word Embeddings for Cross-lingual Dependency Parsing
This paper presents a new approach to the problem of cross-lingual dependency parsing, aiming at leveraging training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts but only considering the morpho-syntactic information associated with ea...
TurkuNLP: Delexicalized Pre-training of Word Embeddings for Dependency Parsing
We present the TurkuNLP entry in the CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies. The system is based on the UDPipe parser with our focus being in exploring various techniques to pre-train the word embeddings used by the parser in order to improve its performance especially on languages with small training sets. The system ranked 11th among the 33 part...
Dual Embeddings and Metrics for Relational Similarity
Abstract. In this work, we study the problem of relational similarity by combining different word embeddings learned from different types of contexts. The word2vec model with linear bag-ofwords contexts can capture more topical and less functional similarity, while the dependency-based word embeddings with syntactic contexts can capture more functional and less topical similarity. We explore to...
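One simple way to combine the two context types, sketched below under the assumption that both spaces are available as word-to-vector maps, is to concatenate the bag-of-words and dependency-based vectors and score relational similarity by the cosine between analogy offsets. All names and toy vectors here are illustrative, not the paper's method.

```python
import numpy as np

# Toy embedding tables; in practice these would come from word2vec
# trained with linear (bag-of-words) vs. dependency-based contexts.
bow = {"king": np.array([1.0, 0.0]), "queen": np.array([0.9, 0.1]),
       "man": np.array([0.5, 0.2]), "woman": np.array([0.4, 0.3])}
dep = {"king": np.array([0.2, 1.0]), "queen": np.array([0.1, 1.1]),
       "man": np.array([0.3, 0.8]), "woman": np.array([0.2, 0.9])}

def embed(word):
    """Dual embedding: concatenation of both spaces."""
    return np.concatenate([bow[word], dep[word]])

def offset(a, b):
    """Vector representing the relation a -> b."""
    return embed(b) - embed(a)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Relational similarity: how alike are (king -> queen) and (man -> woman)?
score = cosine(offset("king", "queen"), offset("man", "woman"))
```

Weighted combinations or learned metrics over the two spaces are natural extensions of the same concatenation idea.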
Dependency-Based Word Embeddings
While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts. In particular, we perform experiments with dependency-based contexts, and show that they produce markedly different embeddings. The dependencybased embedding...
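The dependency-based contexts described above pair each word with its syntactic neighbours, annotated with the connecting relation (and an inverse marker on the head side). A minimal sketch, reusing CoNLL-U-style token fields as an assumption:

```python
from collections import namedtuple

# CoNLL-U-style token: index, surface form, POS tag, head index, relation
Token = namedtuple("Token", "idx form upos head deprel")

def dependency_contexts(sentence):
    """Emit (word, context) pairs where each context is a neighbouring
    word annotated with the dependency relation; the dependent sees its
    head with an inverse marker '-1'."""
    pairs = []
    for tok in sentence:
        if tok.head == 0:  # the root has no governor
            continue
        head = sentence[tok.head - 1]
        pairs.append((head.form, f"{tok.form}/{tok.deprel}"))
        pairs.append((tok.form, f"{head.form}/{tok.deprel}-1"))
    return pairs

# "The cat sleeps" with a toy parse
sent = [
    Token(1, "The", "DET", 2, "det"),
    Token(2, "cat", "NOUN", 3, "nsubj"),
    Token(3, "sleeps", "VERB", 0, "root"),
]
print(dependency_contexts(sent))
```

Unlike a linear window, these contexts reach words that are syntactically close but linearly distant, which is what pushes the embeddings toward functional rather than topical similarity.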
Transferring Coreference Resolvers with Posterior Regularization
We propose a cross-lingual framework for learning coreference resolvers for resource-poor target languages, given a resolver in a source language. Our method uses word-aligned bitext to project information from the source to the target. To handle task-specific costs, we propose a softmax-margin variant of posterior regularization, and we use it to achieve robustness to projection errors. We sho...
Publication year: 2017